N-Step PageRank for Web Search
Abstract
PageRank is widely used to measure the importance of web pages based on their interconnections in the web graph. Mathematically, PageRank can be explained with a Markov random walk model in which only the direct outlinks of a page contribute to its transition probability. In this paper, we propose improving the PageRank algorithm by looking N steps ahead when constructing the transition probability matrix. The motivation comes from the similar “look N steps ahead” strategy used successfully in computer chess. Specifically, we assume that a random surfer who knows the N-step outlinks of each web page can make a better decision about which page to navigate to next. The classical PageRank algorithm is a special case of the proposed N-step PageRank method. Experimental results on the TREC Web track dataset show that the proposed algorithm improves the search accuracy of classical PageRank by more than 15% in terms of mean average precision.
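The abstract does not spell out how the N-step transition matrix is built, so the following is only a minimal sketch under a stated assumption: the surfer on page i picks an outlink j with probability proportional to the number of length-(N-1) walks that start at j. With N = 1 every outlink gets weight 1, which reduces to the classical uniform choice. The function names (`walk_counts`, `n_step_transitions`, `n_step_pagerank`), the damping value 0.85, and the toy graph are all hypothetical and not taken from the paper.

```python
def walk_counts(graph, steps):
    """Count the walks of length `steps` starting at each node (1 when steps == 0)."""
    counts = {node: 1.0 for node in graph}
    for _ in range(steps):
        counts = {node: sum(counts[m] for m in graph[node]) for node in graph}
    return counts


def n_step_transitions(graph, n):
    """Assumed weighting: P(i -> j) is proportional to the (n-1)-step walk count of j.

    `graph` maps each page to a list of distinct outlinks; every outlink
    must also appear as a key of `graph`.
    """
    look_ahead = walk_counts(graph, n - 1)
    transitions = {}
    for node, outlinks in graph.items():
        weights = {target: look_ahead[target] for target in outlinks}
        total = sum(weights.values())
        if outlinks and total == 0.0:
            # Every outlink dead-ends within n-1 steps: fall back to uniform.
            weights = {target: 1.0 for target in outlinks}
            total = float(len(outlinks))
        transitions[node] = {t: w / total for t, w in weights.items()}
    return transitions


def n_step_pagerank(graph, n=1, damping=0.85, iterations=100):
    """Power iteration on the assumed n-step transition probabilities."""
    nodes = list(graph)
    transitions = n_step_transitions(graph, n)
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread uniformly.
        dangling = sum(rank[node] for node in nodes if not transitions[node])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {node: base for node in nodes}
        for node in nodes:
            for target, prob in transitions[node].items():
                new_rank[target] += damping * rank[node] * prob
        rank = new_rank
    return rank


# Toy graph (hypothetical): page "b" has more outlinks than page "c".
graph = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a"], "d": ["a"]}
classical = n_step_pagerank(graph, n=1)   # uniform choice among outlinks
two_step = n_step_pagerank(graph, n=2)    # from "a": P(b) = 0.75, P(c) = 0.25
```

Under this assumed weighting, a surfer on "a" who looks two steps ahead prefers "b" because it opens up three onward pages while "c" opens up one. Whether this matches the paper's exact construction cannot be confirmed from the abstract alone.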
Related resources
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
Some Recent Results on Ranking Webpages and Websites
In this paper we briefly review some of our recent results on the design and analysis of search engine algorithms. The contents include: the limiting behavior of PageRank when the damping factor tends to 1; comparison of the convergence rate of maximal and minimal irreducible Markov chains on the Internet; a new proposal of N-step PageRank algorithm; a new proposal of ranking W...
Google PageRank as mean playing time for pinball on the reverse web
It is known that the output from Google’s PageRank algorithm may be interpreted as (a) the limiting value of a linear recurrence relation that is motivated by interpreting links as votes of confidence, and (b) the invariant measure of a teleporting random walk that follows links except for occasional uniform jumps. Here, we show that, for a sufficiently frequent jump rate, the PageRank score ma...
Asymptotic Analysis for Personalized Web Search
Personalized PageRank is used in Web search as an importance measure for Web documents. The goal of this paper is to characterize the tail behavior of the PageRank distribution in the Web and other complex networks characterized by power laws. To this end, we model the PageRank as a solution of a stochastic equation $R \stackrel{d}{=} \sum_{i=1}^{N} A_i R_i + B$, where the $R_i$’s are distributed as $R$. This equation is inspi...
Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search
The original PageRank algorithm for improving the ranking of search-query results computes a single vector, using the link structure of the Web, to capture the relative “importance” of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accuratel...